This document outlines Cross-Origin Read Blocking (CORB), an algorithm by which dubious cross-origin resource loads may be identified and blocked by web browsers before they reach the web page. CORB reduces the risk of leaking sensitive data by keeping it further from cross-origin web pages. In most browsers, it keeps such data out of untrusted script execution contexts. In browsers with Site Isolation, it can keep such data out of untrusted renderer processes entirely, helping even against side channel attacks.
The same-origin policy generally prevents one origin from reading arbitrary network resources from another origin. In practice, enforcing this policy is not as simple as blocking all cross-origin loads: exceptions must be established for web features, like <img> or <script>, which can target cross-origin resources for historical reasons, and for the CORS mechanism, which allows some resources to be selectively read across origins.
Certain types of content, however, can be shown to be incompatible with all of the historically-allowed permissive contexts. JSON is one such type: a JSON response will result in a decode error when targeted by the <img> tag, either a no-op or syntax error when targeted by the <script> tag, and so on. The only case where a web page can load JSON with observable consequences is via fetch() or XMLHttpRequest; and in those cases, cross-origin reads are moderated by CORS.
By detecting and blocking loads of CORB-protected resources early -- that is, before the response makes it to the image decoder or JavaScript parser stage -- CORB defends against side channel vulnerabilities that may be present in the stages which are skipped.
CORB mitigates the following attack vectors:
Cross-Site Script Inclusion (XSSI)
XSSI is the technique of pointing a <script> tag at a target resource which is not JavaScript, and observing some side effects when the resulting resource is interpreted as JavaScript. An early example of this attack was discovered in 2006: by overwriting the JavaScript Array constructor, the contents of JSON lists could be intercepted as simply as: <script src="https://example.com/secret.json">. While the array constructor attack vector is fixed in current browsers, numerous similar exploits have been found and fixed in the subsequent decade. For example, see the slides here. CORB mitigates this class of attacks by blocking a CORB-protected resource before it is delivered to a cross-site <script> element.
Speculative Side Channel Attack (e.g. Spectre)
An attacker may use an <img src="https://example.com/secret.json"> element to pull a cross-site secret into the process where the attacker's JavaScript runs, and then use a speculative side channel attack (e.g. Spectre) to read the secret.
When CORB decides that a response needs to be CORB-protected, the response is modified as follows:
[lukasza@chromium.org] Chromium currently retains Access-Control-* headers (this helps generate better error messages for CORS).
To be effective against speculative side-channel attacks, CORB blocking must take place before the response reaches the process hosting the cross-origin initiator of the request. In other words, CORB blocking should prevent CORB-protected response data from ever being present in the memory of the process hosting a cross-origin website (even temporarily or for a short term). This is different from the concept of filtered responses (e.g. CORS filtered response or opaque filtered response) which just provide a limited view into full data that remains stored in an internal response and may be implemented inside the renderer process.
A CORB demo page is available here.
The following kinds of requests are CORB-exempt:
- Navigation requests. <iframe>s, <object>s, and <embed>s create a separate security context and thus pose less risk for leaking the data. In most browsers, this separate context means that a malicious page would have more trouble inferring the contents than from loading them into its own execution context and observing side effects (e.g., XSSI, style tags, etc). In browsers with Site Isolation, this security context uses a separate process, keeping the data out of the malicious page's address space entirely.
[lukasza@chromium.org] TODO: Figure out how Edge's VM-based isolation works (e.g. if some origins are off-limits in particular renderers, then this would greatly increase utility of CORB in Edge).
- Download requests.
[lukasza@chromium.org] AFAIK, in Chrome a response to a download request never passes through memory of a renderer process. Not sure if this is true in other browsers.
All other kinds of requests may be CORB-eligible. This includes:
- ping, navigator.sendBeacon()
- <link rel="prefetch" ...>
- Image requests such as the <img> tag, /favicon.ico, SVG's <image>, CSS' background-image, etc.
- Script-like destinations such as <script>, importScripts(), navigator.serviceWorker.register(), audioWorklet.addModule(), etc.
The essential idea of CORB is to consider whether a particular resource might be unsuitable for use in every context listed above. If every possible usage would result in either a CORS error, a syntax/decoding error, or no observable consequence, CORB ought to be able to block the cross-origin load without changing the observable consequences of the load. Even without CORB, details of cross-origin errors are already suppressed to prevent information leaks. Thus, the observable consequences of such errors are already limited, and feasible to preserve while blocking.
As discussed below, the following types of content are CORB-protected: JSON, HTML, and XML. These are each discussed in the following sections.
JSON is a widely used data format on the web; support for JSON is built into the web platform. JSON responses are very likely to contain user data worth protecting. Additionally, unlike HTML or image formats, there are no legacy HTML mechanisms (that is, predating CORS) which allow cross-origin embedding of JSON resources.
Because the JSON syntax is derived from and overlaps with JavaScript, care must be taken to handle the possibility of JavaScript/JSON polyglots. CORB handles the following cases for JSON:
Non-empty JSON object literal: A non-empty JSON object (such as {"key": "value"}). This is precisely the subset of JSON syntax which is invalid JavaScript syntax -- the colon after the first string literal will generate a syntax error. CORB can protect these cases, even if labeled with a different Content-Type, by sniffing the response body.
Other JSON literals: The remaining subset of the JSON syntax (for example, null or [1, 2, "3"]) also happens to be valid JavaScript syntax. In particular, when evaluated as script, they are value expressions that should have no side effects. Thus, if they can be detected, they can be CORB-protected. Detection here is possible, but requires implementing a validator that understands the full JSON syntax, because a response that begins with a JSON literal may continue as JavaScript, for example [1, 2, "3"].map(...).
JSON served with an XSSI-defeating prefix: As a mitigation for past browser vulnerabilities, many websites and frameworks employ a convention of prefixing their fetchable resources with a string designed to force a JavaScript error. These prefixes were not standardized prior to CORB, but a few approaches seem prevalent:
- The character sequence )]}' is built into the angular.js framework and the Java Spring framework, and is observed in wide use on the google.com domain.
- The character sequence {} && was historically built into the Java Spring framework.
- Infinite loops, such as for(;;);, are observed in wide use on the facebook.com domain.
The presence of these recognized XSSI defenses is a strong signal to the CORB algorithm that a resource should be CORB-protected. As such, these prefixes should trigger CORB protection in almost every case, no matter what follows them. This is argued to be safe because:
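- A JSON security prefix would cause a syntax error (or a hang) in any resource that is executed as script, such as one labeled text/javascript.
- JSON security prefixes do not collide with binary formats such as images, which begin with fixed byte sequences (for example, FF D8 FF for image/jpeg).
- Collisions with text/css stylesheets are theoretically possible, because it is possible to construct a file that begins with a JSON security prefix but at the same time parses fine as a stylesheet. text/css is therefore established as an exception, even though the practical likelihood of such a scenario seems low. An example of such a stylesheet: )]}' {} h1 { color: red; }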
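The two body-sniffing cases above (a recognized security prefix, and the start of a non-empty JSON object) can be sketched roughly as follows. This is a minimal illustration, not the browser's sniffer: the function names are hypothetical, the prefix is assumed to sit at the very start of the body, and escape sequences inside the first key are deliberately ignored.

```python
# Prefixes taken from the list above; everything else is an illustrative assumption.
JSON_SECURITY_PREFIXES = (")]}'", "{} &&", "for(;;);")


def has_json_security_prefix(body: str) -> bool:
    """Return True if the body begins with a recognized XSSI-defeating prefix."""
    return body.startswith(JSON_SECURITY_PREFIXES)


def looks_like_json_object(body: str) -> bool:
    """Return True if the body starts like a non-empty JSON object: { "key" :

    The colon after the first string literal is what makes this invalid
    JavaScript. Simplification: escaped quotes inside the key are not handled.
    """
    rest = body.lstrip()
    if not rest.startswith("{"):
        return False
    rest = rest[1:].lstrip()
    if not rest.startswith('"'):
        return False
    closing_quote = rest.find('"', 1)
    if closing_quote == -1:
        return False
    return rest[closing_quote + 1:].lstrip().startswith(":")
```

An empty object ({}) and array literals are intentionally not matched by looks_like_json_object(), since they are also valid JavaScript and would need the full JSON validator discussed above.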
JSON is also used by some web features. One example is <link rel="manifest">, whose href attribute specifies a JSON manifest file. Fortunately, this mechanism requires CORS when the manifest is specified cross-origin, so its CORB treatment works identically to the rules applied to fetch().
[nick@chromium.org] TODO: Is there a spec link for JSON being side-effect free when interpreted as script?
HTML can be embedded cross-origin via <iframe> (as noted above), but otherwise HTML documents can only be loaded by fetch() and XHR, both of which require CORS. HTML sniffing is already well-understood, so (unlike JSON) it is relatively easy to identify HTML resources with high confidence.
One ambiguous polyglot case has been identified that CORB needs to handle conservatively: HTML-style comments, which are part of the JavaScript syntax.
- CORB needs to skip HTML comments when sniffing for HTML: the presence of the “<!--” string doesn't immediately confirm that the sniffed resource is an HTML document - the HTML comment still has to be followed by a valid HTML tag.
- After the end of an HTML comment, sniffing also has to account for the JavaScript SingleLineHTMLCloseComment rule, which can consume SingleLineCommentChars after the “-->” characters.
Examples of html/javascript polyglots which have been observed in use on real websites:
<!--/*--><html><body><script type="text/javascript"><!--//*/
var x = "This is both valid html and valid javascript";
//--></script></body></html>

<!-- comment --> <script type='text/javascript'>
//<![CDATA[
var x = "This is both valid html and valid javascript";
//]]>--></script>
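A conservative HTML sniff along these lines might look like the sketch below. It assumes a deliberately short, hypothetical tag list (a real sniffer would recognize more tags), and it treats the rest of the line after “-->” as consumed, mirroring the JavaScript comment rule described above.

```python
import re

# Hypothetical, intentionally short tag list used only for illustration.
HTML_TAG_RE = re.compile(
    r"<(!doctype html|html|head|body|script|iframe|div|p)\b", re.IGNORECASE
)
LINE_TERMINATOR_RE = re.compile(r"[\r\n\u2028\u2029]")


def sniffs_as_html(body: str) -> bool:
    """Return True only if a recognized HTML tag follows any leading comments."""
    rest = body.lstrip()
    while rest.startswith("<!--"):
        end = rest.find("-->", 4)
        if end == -1:
            return False  # Unterminated comment: HTML cannot be confirmed.
        rest = rest[end + 3:]
        # JavaScript would treat the remainder of this line as a comment,
        # so it is skipped before looking for an HTML tag.
        terminator = LINE_TERMINATOR_RE.search(rest)
        if terminator is None:
            return False
        rest = rest[terminator.end():].lstrip()
    return HTML_TAG_RE.match(rest) is not None
```

With this approach the polyglot examples above are not confirmed as HTML, so they would not be blocked on the basis of sniffing, while an ordinary document that opens with a comment line followed by a tag still is.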
XML, like JSON, is a widely used data exchange format, and like HTML, is a document format that's built into the web platform (notably via XMLHttpRequest).
Confirming an XML content-type via sniffing is more straightforward than JSON or HTML: XML is signified by the pattern <?xml, possibly preceded by whitespace.
The only identified XML case that requires special treatment by CORB is image/svg+xml, which is an image type. All other XML MIME types are treated as CORB-protected.
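The XML rules above are simple enough to sketch directly. The MIME classifier below is an assumption for illustration (text/xml, application/xml, and any type ending in +xml); the source does not spell out which types count as XML.

```python
SVG_MIME_TYPE = "image/svg+xml"


def sniffs_as_xml(body: str) -> bool:
    """XML is signified by "<?xml", possibly preceded by whitespace."""
    return body.lstrip().startswith("<?xml")


def is_corb_protected_xml_type(content_type: str) -> bool:
    """Hypothetical classifier: XML MIME types are protected, except SVG."""
    mime = content_type.split(";")[0].strip().lower()
    if mime == SVG_MIME_TYPE:
        return False  # SVG is an image type and stays embeddable cross-origin.
    return mime in ("text/xml", "application/xml") or mime.endswith("+xml")
```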
CORB decides whether a response needs protection (i.e. if a response is a JSON, HTML or XML resource) based on the following:
If the response contains an X-Content-Type-Options: nosniff response header, then the response will be CORB-protected if its Content-Type header is one of the following:
- HTML MIME type
- XML MIME type (except image/svg+xml, which is CORB-exempt as described above)
- JSON MIME type
- text/plain
If the response is a 206 response, then the response will be CORB-protected if its Content-Type header is one of the following:
- HTML MIME type
- XML MIME type (except image/svg+xml, which is CORB-exempt as described above)
- JSON MIME type
Otherwise, CORB attempts to sniff the response body:
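- a response labeled as an HTML MIME type that sniffs as HTML is CORB-protected
- a response labeled as an XML MIME type (except image/svg+xml) that sniffs as XML is CORB-protected
- a response labeled as a JSON MIME type that sniffs as JSON is CORB-protected
- a response labeled as text/plain that sniffs as JSON, HTML or XML is CORB-protected
- a response with any Content-Type (except text/css) that begins with a JSON security prefix is CORB-protected
The sniffing is necessary to avoid blocking existing web pages that depend on mislabeled cross-origin responses (e.g. on images served as text/html). Without sniffing, CORB would block around 16 times as many responses.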
- CORB only sniffs to confirm the classification based on the Content-Type header (i.e. if the Content-Type header is text/json then CORB will sniff for JSON and will not sniff for HTML or XML).
- Because sniffing is a best-effort heuristic, web developers should 1) label responses with the correct Content-Type header and 2) opt out of sniffing by using the X-Content-Type-Options: nosniff header.
[nick@chromium.org] This section needs a strong justification for why text/plain gets this special interpretation. Ideally data showing that text/plain is commonly used to serve HTML, JSON, or XML. Treatment of text/plain in our current implementation may actually be an artifact of an earlier prototype, which ran after standard mime sniffing, and may have seen ‘text/plain’ MIME types applied as a default MIME type when the response omitted a Content-Type header.
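The decision order described in this section (nosniff first, then 206 responses, then confirmation sniffing) can be sketched as below. This is a simplified model, not the browser implementation: the MIME sets, the crude sniff() stand-in, and the function names are all assumptions made for illustration.

```python
from typing import Optional

# Assumed MIME classifications; the source does not enumerate them.
HTML_TYPES = {"text/html"}
JSON_TYPES = {"application/json", "text/json"}
XML_TYPES = {"text/xml", "application/xml"}
JSON_SECURITY_PREFIXES = (")]}'", "{} &&", "for(;;);")


def classify(mime: str) -> Optional[str]:
    """Map a MIME type to "html", "json", "xml", or None (not protected)."""
    if mime in HTML_TYPES:
        return "html"
    if mime in JSON_TYPES or mime.endswith("+json"):
        return "json"
    if mime == "image/svg+xml":
        return None  # SVG is an image type and is CORB-exempt.
    if mime in XML_TYPES or mime.endswith("+xml"):
        return "xml"
    return None


def sniff(body: str) -> Optional[str]:
    """Crude stand-in for the real HTML/XML/JSON sniffers."""
    text = body.lstrip()
    if text.lower().startswith(("<!doctype html", "<html")):
        return "html"
    if text.startswith("<?xml"):
        return "xml"
    if text.startswith('{"'):
        return "json"
    return None


def is_corb_protected(content_type: str, body: str,
                      nosniff: bool = False, status: int = 200) -> bool:
    mime = content_type.split(";")[0].strip().lower()
    if not mime or mime.startswith("multipart/"):
        return False  # No Content-Type, or multipart: never protected.
    kind = classify(mime)
    if nosniff:
        return kind is not None or mime == "text/plain"
    if status == 206:
        return kind is not None
    if mime != "text/css" and body.startswith(JSON_SECURITY_PREFIXES):
        return True
    sniffed = sniff(body)
    if mime == "text/plain":
        return sniffed is not None
    # Sniffing only confirms the type announced by Content-Type.
    return kind is not None and sniffed == kind
```

For example, under these assumptions a PNG body served as text/html is left alone when sniffing is allowed, but is protected (and therefore blocked) once the response also carries nosniff.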
Note that the above means that the following responses are not CORB-protected:
- Responses labeled as multipart/*. This avoids having to parse the content types of the nested parts. We recommend not supporting multipart range requests for sensitive documents.
- Responses without a Content-Type header.
- Responses labeled as text/javascript. This includes JSONP (“JSON with padding”) which unlike JSON is meant to be read and executed in a cross-origin context.
CORB should have no observable impact on <img> tags unless the image resource is both 1) mislabeled with an incorrect, non-image, CORB-protected Content-Type and 2) served with the X-Content-Type-Options: nosniff response header.
Examples:
Correctly-labeled HTML document
- Resource used in an <img> tag: Content-Type: text/html, no X-Content-Type-Options header
- WPT test: fetch/corb/img-html-correctly-labeled.sub.html
Mislabeled image (with sniffing)
- Resource used in an <img> tag: Content-Type: text/html, no X-Content-Type-Options header
- WPT test: fetch/corb/img-png-mislabeled-as-html.sub.html
Mislabeled image (nosniff)
- Resource used in an <img> tag: Content-Type: text/html, X-Content-Type-Options: nosniff
- Expected behavior: because of the nosniff header, CORB will have to rely on the Content-Type header. Because this response is mislabeled (the body is an image, but the Content-Type header says that it is an HTML document), CORB will incorrectly classify the response as requiring CORB-protection.
- WPT test: fetch/corb/img-png-mislabeled-as-html-nosniff.tentative.sub.html
In addition to the HTML <img> tag, the examples above should apply to other web features that consume images - including, but not limited to:
- /favicon.ico
- SVG's <image>
- <link rel="preload" as="image" ...> (see WPT test: fetch/corb/preload-image-png-mislabeled-as-html-nosniff.tentative.sub.html)
- background-image in stylesheets
- images painted onto a <canvas>
[lukasza@chromium.org] Earlier attempts to block nosniff images with incompatible MIME types failed. We think that CORB will have more luck, because it will only block a subset of CORB-protected MIME types (e.g. it won't block application/octet-stream as quoted in a Firefox bug).
Audio and video resources should see similar impact as images, though 206 responses are more likely to occur for media.
CORB should have no observable impact on <script> tags except for cases where a CORB-protected, non-JavaScript resource labeled with its correct MIME type is loaded as a script. In these cases the resource would usually result in a syntax error, but the empty body of a CORB-protected response will result in no error.
Examples:
Correctly-labeled HTML document
- Resource used in a <script> tag: Content-Type: text/html, no X-Content-Type-Options header
- WPT test: fetch/corb/script-html-correctly-labeled.tentative.sub.html
[lukasza@chromium.org] In theory, using a non-empty response in CORB-blocked responses might reintroduce the lost syntax error. We didn't go down that path, because
- using a non-empty response would be inconsistent with other parts of the Fetch spec (like opaque filtered response).
- retaining the presence of the syntax error might require changing the contents of a CORB-blocked response body depending on whether the original response body would have caused a syntax error or not. This would add extra complexity that seems undesirable both for CORB implementors and for web developers.
Mislabeled script (with sniffing)
- Resource used in a <script> tag: Content-Type: text/html, no X-Content-Type-Options header
- WPT test: fetch/corb/script-js-mislabeled-as-html.sub.html
Mislabeled script (nosniff)
- Resource used in a <script> tag: Content-Type: text/html, X-Content-Type-Options: nosniff
- Expected behavior: the nosniff response header will cause the response to be blocked when its MIME type (text/html in the example) is not a JavaScript MIME type (this behavior is required by the Fetch spec).
- WPT test: fetch/corb/script-js-mislabeled-as-html-nosniff.sub.html
In addition to the HTML <script> tag, the examples above should apply to other web features that consume JavaScript, including script-like destinations such as importScripts(), navigator.serviceWorker.register(), audioWorklet.addModule(), etc.
CORB should have no observable impact on stylesheets.
Examples:
Anything not labeled as text/css
- Examples of resources used in a <link rel="stylesheet" href="..."> tag: Content-Type: text/html, no X-Content-Type-Options header
- Expected behavior: no observable difference. Even without CORB, such stylesheet examples will be rejected because, due to the relaxed syntax rules of CSS, cross-origin CSS requires a correct Content-Type header (restrictions vary by browser: IE, Firefox, Chrome, Safari (scroll down to CVE-2010-0051) and Opera). This behavior is covered by the HTML spec which 1) asks to only assume text/css Content-Type if the document embedding the stylesheet has been set to quirks mode and has the same origin and 2) only asks to run the steps for creating a CSS style sheet if Content-Type of the obtained resource is text/css.
- WPT tests: fetch/corb/style-css-mislabeled-as-html.sub.html, fetch/corb/style-html-correctly-labeled.sub.html
Anything not labeled as text/css (nosniff)
- Resource used in a <link rel="stylesheet" href="..."> tag: Content-Type: text/html, X-Content-Type-Options: nosniff
- Expected behavior: no observable difference. The nosniff response header will cause the response to be blocked when its MIME type (text/html in the example) is not text/css (this behavior is required by the Fetch spec).
- WPT test: fetch/corb/style-css-mislabeled-as-html-nosniff.sub.html
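Correctly-labeled stylesheet with a JSON security prefix
- Resource used in a <link rel="stylesheet" href="..."> tag: Content-Type: text/css, no X-Content-Type-Options header
- Expected behavior: no observable difference, because CORB does not treat a JSON security prefix as a blocking signal for responses labeled as Content-Type: text/css.
- WPT test: fetch/corb/style-css-with-json-parser-breaker.sub.html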
CORB has no impact on the following scenarios:
Tracking and reporting
Websites commonly track usage by pointing an img element to an HTTP URI that usually replies either with a 204 or with a short HTML document. In addition to the img tag, websites may use style, script and other tags to track usage.
Service workers
Blob and File API
WPT test: script-html-via-cross-origin-blob-url.sub.html (and also tests for navigation requests covered by the commit here).
Content scripts and plugins
CORB has been enabled in optional Site Isolation modes and field trials, and Chromium has been instrumented to count how many CORB-eligible responses are blocked. (CORB-eligible responses exclude navigation requests and downloads; see the “What kinds of requests are CORB-eligible?” section above.) Our analysis of the initial data from Chrome Canary in February 2018 shows a low upper bound on the number of cases observable to web pages, with possibilities to further lower the bounds.
Overall, 0.961% of all CORB-eligible responses are blocked. However, over half of these are empty responses already (i.e., actually have a Content-Length: 0 response header), and thus cause effectively no behavior change (i.e., only non-safelisted headers would be affected). Note that if sniffing were omitted, almost 20% of responses would be blocked, so sniffing is a clear necessity.
Looking closer, 0.456% of all CORB-eligible responses are non-empty and blocked. However, most of these cases fall into the non-observable categories described in the subsections above, such as HTML responses being delivered to image tags as tracking pixels.
We can focus on two groups of blocked responses which may have observable impact.
[creis@chromium.org] We are considering lowering this bound further by sniffing these responses to confirm how many might contain actual images.
Another 3.76% of these are range requests for text/plain from a media context. We have not yet found examples in practice, but we are considering allowing range request responses for text/plain to avoid disruption here.
0.014% of all CORB-eligible responses were invalid inputs to script tags, since CORB sniffing revealed they were HTML, XML, or JSON. Again, this is specific to non-empty responses that do not have a 204 status code. These cases should have minimal risk of disruption in practice (e.g., more than half have error status codes and likely represent broken links), but it is technically possible to observe a difference based on whether a syntax error is reported.
These numbers of affected cases are sufficiently low to suggest that CORB is promising from a web compatibility perspective.
The currently proposed version of CORB only protects JSON, HTML and XML resources - other sensitive resources need to be protected in some other way. One possible approach is to protect such resources via unguessable XSRF tokens which are distributed via JSON (which is CORB-protected).
In the future CORB may be extended to protect additional resources as follows:
Covering more MIME types. Instead of blocklisting HTML, XML, and JSON, CORB protection can be extended to all MIME types, except MIME types that are allowlisted as usable in <img>, <audio>, <video>, <script> and other similar elements that can be embedded cross-origin:
- JavaScript MIME types such as text/javascript, application/javascript, or text/jscript
- text/css
- image/*
- audio/*, video/* or application/ogg
- font/* or one of the legacy font types
- application/octet-stream, text/vtt
This extension would offer CORB-protection to resources like PDFs or ZIP files. CORB would not perform confirmation sniffing for MIME types other than HTML, XML and JSON (since it is not practical to teach the CORB sniffer about all the possible MIME types). On the other hand, the value of confirmation sniffing for these other MIME types seems low, since mislabeling content as such types seems less likely than, for example, mislabeling as text/html.
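A minimal sketch of this proposed allowlist approach, assuming glob-style matching on the MIME type. The function name is hypothetical, and the legacy font types are omitted because the text does not enumerate them.

```python
from fnmatch import fnmatch

# Allowlist entries taken from the list above; legacy font types are not
# included because they are not spelled out in the text.
EMBEDDABLE_MIME_PATTERNS = (
    "text/javascript", "application/javascript", "text/jscript",
    "text/css",
    "image/*",
    "audio/*", "video/*", "application/ogg",
    "font/*",
    "application/octet-stream", "text/vtt",
)


def would_be_protected(content_type: str) -> bool:
    """Protect every MIME type that is not allowlisted as embeddable."""
    mime = content_type.split(";")[0].strip().lower()
    return not any(fnmatch(mime, pattern) for pattern in EMBEDDABLE_MIME_PATTERNS)
```

Under this sketch application/pdf and application/zip become protected, while image/png and text/css remain embeddable.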
[lukasza@chromium.org] See also https://github.com/whatwg/fetch/issues/721
[lukasza@chromium.org] Currently considered CORB opt-in signals include:
- From-Origin: or Cross-Origin-Resource-Policy: header - see https://github.com/whatwg/fetch/issues/687
- Isolate-Me header - see https://github.com/WICG/isolation
Since May 2018, the CORB section in the Fetch spec has covered the handling of nosniff and 206 responses.
CORB confirmation sniffing is not standardized yet.
Some aspects of CORB are under discussion and may evolve over time.
Tracking bugs:
Status of Web Platform Tests: